Abstract. Hereditarily finite (HF) set theory provides a standard universe
of sets, but with no infinite sets. Its utility is demonstrated through
a formalisation of the theory of regular languages and finite automata,
including the Myhill-Nerode theorem and Brzozowski’s minimisation
algorithm. The states of an automaton are HF sets, possibly constructed
by product, sum, powerset and similar operations.

1 Introduction 

The theory of finite state machines is fundamental to computer science. It has 
applications to lexical analysis, hardware design and regular expression
pattern matching. A regular language is one accepted by a finite state machine,
or equivalently, one generated by a regular expression or a type-3 grammar [6].
Researchers have been formalising this theory for nearly three decades. 

A critical question is how to represent the states of a machine. Automata 
theory is developed using set-theoretic constructions, e.g. the product, disjoint 
sum or powerset of sets of states. But in a strongly-typed formalism such as 
higher-order logic (HOL), machines cannot be polymorphic in the type of states: 
statements such as “every regular language is accepted by a finite state machine” 
would require existential quantification over types. One might conclude that 
there is no good way to formalise automata in HOL [mu¬ 
It turns out that finite automata theory can be formalised within the theory
of hereditarily finite sets: set theory with the negation of the axiom of infinity.
It admits the usual constructions, including lists, functions and integers, but no
infinite sets. The type of HF sets can be constructed from the natural numbers
within higher-order logic. Using HF sets, we can retain the textbook definitions,
without ugly numeric coding. We can expect HF sets to find many other
applications when formalising theoretical computer science.

The paper introduces HF set theory and automata (Sect. 2). It presents a
formalisation of deterministic finite automata and results such as the
Myhill-Nerode theorem (Sect. 3). It also treats nondeterministic finite automata and
results such as the powerset construction and closure under regular expression
operations (Sect. 4). Next come minimal automata, their uniqueness up to
isomorphism, and Brzozowski’s algorithm for minimising an automaton [3] (Sect. 5).
The paper concludes after discussing related work (Sects. 6 and 7). The proofs, which
are available online [H;, also demonstrate the use of Isabelle’s locales [1].



2 Background 


An hereditarily finite set can be understood inductively as a finite set of
hereditarily finite sets [14]. This definition justifies the recursive definition
$f(x) = \sum \{2^{f(y)} \mid y \in x\}$, yielding a bijection
$f : \mathrm{HF} \to \mathbb{N}$ between the HF sets and the natural numbers.
The linear ordering on HF given by $x < y \iff f(x) < f(y)$
can be shown to extend both the membership and the subset relations.
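
The coding can be tried out concretely. The following Python sketch is only an illustration and not part of the Isabelle development; the names hf_code and hf_set are hypothetical, and HF sets are modelled as nested frozensets.

```python
# Illustrative sketch: the coding f(x) = sum of 2**f(y) for y in x.
# hf_code and hf_set are hypothetical names, not taken from the formal proofs.

def hf_code(x: frozenset) -> int:
    """Map a hereditarily finite set (nested frozensets) to a natural number."""
    return sum(2 ** hf_code(y) for y in x)

def hf_set(n: int) -> frozenset:
    """Inverse map: bit i of n is set exactly when hf_set(i) is a member."""
    members = []
    i = 0
    while n:
        if n & 1:
            members.append(hf_set(i))
        n >>= 1
        i += 1
    return frozenset(members)

empty = frozenset()              # the empty set, coded as 0
one = frozenset({empty})         # {0}, coded as 2**0 = 1
two = frozenset({empty, one})    # {0, 1}, coded as 2**0 + 2**1 = 3
```

Here one is both a member and a subset of two, and its code is smaller, in line with the claim that the ordering extends membership and inclusion.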

The HF sets support many standard constructions, even quotients. Equivalence
classes are not available in general (they may be infinite), but the linear
ordering over HF identifies a unique representative. The integers and rationals
can be constructed, with their operations (but not the set of integers, obviously).
Świerczkowski [14] has used HF as the basis for proving Gödel’s incompleteness
theorems, and I have formalised his work using Isabelle [13].

Let $\Sigma$ be a nonempty, finite alphabet of symbols. Then $\Sigma^*$ is the set of
words: finite sequences of symbols. The empty word is written $\epsilon$, and the
concatenation of words u and v is written uv. A deterministic finite automaton
(DFA) [Bl?j is a structure $(K, \Sigma, \delta, q_0, F)$ where K is a finite set of
states, $\delta : K \times \Sigma \to K$ is the next-state function, $q_0 \in K$ is the
initial state and $F \subseteq K$ is the set of final or accepting states. The
next-state function on symbols is extended to one on words,
$\delta^* : K \times \Sigma^* \to K$, such that $\delta^*(q, \epsilon) = q$,
$\delta^*(q, a) = \delta(q, a)$ for $a \in \Sigma$ and
$\delta^*(q, uv) = \delta^*(\delta^*(q, u), v)$. The DFA accepts the string w if
$\delta^*(q_0, w) \in F$. A set $L \subseteq \Sigma^*$ is a regular language if L is
the set of strings accepted by some DFA.
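
As a concrete reading of these definitions, the following Python sketch folds the next-state function over a word. It is an illustration only; the dictionary layout and the example machine EVEN_A are assumptions made for this sketch.

```python
from functools import reduce

def delta_star(dfa, q, word):
    """Extend the next-state function from symbols to words by folding over the word."""
    return reduce(lambda state, a: dfa["delta"][(state, a)], word, q)

def accepts(dfa, word):
    """A word is accepted if it drives the initial state into a final state."""
    return delta_star(dfa, dfa["q0"], word) in dfa["F"]

# Hypothetical example: words over {a, b} containing an even number of a's.
EVEN_A = {
    "K": {0, 1},
    "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
    "q0": 0,
    "F": {0},
}
```

For instance, accepts(EVEN_A, "abba") holds while accepts(EVEN_A, "ab") does not.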

A nondeterministic finite automaton (NFA) is similar, but admits multiple
execution paths and accepts a string if one of them reaches a final state. Formally,
an NFA is a structure $(K, \Sigma, \delta, Q_0, F)$ where
$\delta : K \times \Sigma \to \mathcal{P}(K)$ is the next-state function,
$Q_0 \subseteq K$ a set of initial states, the other components as above.
The next-state function is extended to
$\delta^* : \mathcal{P}(K) \times \Sigma^* \to \mathcal{P}(K)$ such that
$\delta^*(Q, \epsilon) = Q$, $\delta^*(Q, a) = \bigcup_{q \in Q} \delta(q, a)$ for
$a \in \Sigma$ and $\delta^*(Q, uv) = \delta^*(\delta^*(Q, u), v)$.
An NFA accepts the string w provided $\delta^*(Q_0, w) \cap F \neq \emptyset$, that is,
provided some initial state can reach a final state on w.
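
The extension to sets of states can be sketched in the same way. The Python below is illustrative only; the representation and the example ENDS_AB are assumptions, and missing dictionary entries stand for the empty set of successors.

```python
def nfa_delta_star(nfa, Q, word):
    """Track the set of states reachable from Q, one symbol at a time."""
    current = frozenset(Q)
    for a in word:
        current = frozenset(p for q in current
                              for p in nfa["delta"].get((q, a), ()))
    return current

def nfa_accepts(nfa, word):
    """Accept if some initial state can reach some final state on the word."""
    return bool(nfa_delta_star(nfa, nfa["Q0"], word) & frozenset(nfa["F"]))

# Hypothetical example: words over {a, b} that end in "ab".
ENDS_AB = {
    "delta": {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}},
    "Q0": {0},
    "F": {2},
}
```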

The notion of NFA can be extended with ε-transitions, allowing “silent”
transitions between states. Define the transition relation
$q \xrightarrow{a} q'$ for $q' \in \delta(q, a)$.
Let the ε-transition relation $q \xrightarrow{\epsilon} q'$ be given. Then define the
transition relation $q \stackrel{a}{\Longrightarrow} q'$ to allow ε-transitions before
and after: $(\xrightarrow{\epsilon})^* \circ (\xrightarrow{a}) \circ (\xrightarrow{\epsilon})^*$.

Every NFA can be transformed into a DFA, where the set of states is the 
powerset of the NFA’s states, and the next-state function captures the effect of 
$q \stackrel{a}{\Longrightarrow} q'$ on these sets of states. Regular languages are closed under intersection
and complement, therefore also under union. They are closed under repetition 
(Kleene star). Two key results are discussed below: 

— The Myhill-Nerode theorem gives necessary and sufficient conditions for a 
language to be regular. It defines a canonical and minimal DFA for any given 
regular language. Minimal DFAs are unique up to isomorphism. 

— Reorienting the arrows of the transition relation transforms a DFA into an 
NFA accepting the reverse of the given language. We can regain a DFA using 
the powerset construction. Repeating this operation yields a minimal DFA 
for the original language. This is Brzozowski’s minimisation algorithm [3].


This work has been done using the proof assistant Isabelle/HOL. Documentation
is available online at http://isabelle.in.tum.de/. The work refers to
equivalence relations and equivalence classes, following the conventions
established in my earlier paper m- If R is an equivalence relation on the set A, then
A//R is the set of equivalence classes. If x ∈ A, then its equivalence class is R``{x}.
Formally, it is the image of x under R: the set of all y such that (x,y) ∈ R. More
generally, if X ⊆ A then R``X is the union of the equivalence classes R``{x} for x ∈ X.
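
For readers unfamiliar with the image notation, here is a small Python sketch of R``X and A//R on a finite example. The function names image and quotient, and the relation chosen, are assumptions made for illustration.

```python
def image(R, X):
    """R``X: every y related by R to some x in X."""
    return frozenset(y for (x, y) in R if x in X)

def quotient(A, R):
    """A//R: the set of equivalence classes R``{x} for x in A."""
    return frozenset(image(R, {x}) for x in A)

# Hypothetical example: congruence modulo 3 on the numbers 0..5.
A = range(6)
R = {(x, y) for x in A for y in A if x % 3 == y % 3}
```

Here image(R, {0}) is the class {0, 3}, and image(R, {0, 1}) is the union of the classes of 0 and 1.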

3 Deterministic Automata; the Myhill-Nerode Theorem 

When adopting HF set theory, there is the question of whether to use it for 
everything, or only where necessary. The set of states is finite, so it could be 
an HF set, and similarly for the set of final states. The alphabet could also be 
given by an HF set; then words— lists of symbols—would also be HF sets. Our 
definitions could be essentially typeless. 

The approach adopted here is less radical. It makes a minimal use of HF,
allowing stronger type-checking, although this does cause complications
elsewhere. Standard HOL sets (which are effectively predicates) are intermixed with
HF sets. An HF set has type hf, while a (possibly infinite) set of HF sets has
type hf set. Definitions are polymorphic in the type 'a of alphabet symbols,
while words have type 'a list.

3.1 Basic Definition of DFAs 

The record definition below declares the components of a DFA. The types make 
it clear that there is indeed a set of states but only a single initial state, etc. 

record 'a dfa = states :: "hf set"
                init   :: "hf"
                final  :: "hf set"
                nxt    :: "hf ⇒ 'a ⇒ hf"


Now we package up the axioms of the DFA as a locale [1]:

locale dfa =
  fixes M :: "'a dfa"
  assumes init:   "init M ∈ states M"
      and final:  "final M ⊆ states M"
      and nxt:    "⋀q x. q ∈ states M ⟹ nxt M q x ∈ states M"
      and finite: "finite (states M)"

The last assumption is needed because the states field has type hf set and 
not hf. The locale bundles the assumptions above into a local context, where 
they are directly available. It is then easy to define the accepted language. 

primrec nextl :: "hf ⇒ 'a list ⇒ hf" where
    "nextl q [] = q"
  | "nextl q (x#xs) = nextl (nxt M q x) xs"

definition language :: "('a list) set" where
  "language ≡ {xs. nextl (init M) xs ∈ final M}"

Equivalence relations play a significant role below. The following relation regards
two strings as equivalent if they take the machine to the same state [7, p. 90].

definition eq_nextl :: "('a list × 'a list) set" where
  "eq_nextl ≡ {(u,v). nextl (init M) u = nextl (init M) v}"

Note that language and eq_nextl take no arguments, but refer to the locale.

3.2 Myhill-Nerode Relations 

The Myhill-Nerode theorem asserts the equivalence of three characterisations 
of regular languages. The first of these is to be the language accepted by some 
DFA. The other two are connected with certain equivalence relations, called 
Myhill-Nerode relations, on words of the language. 

The definitions below are outside of the locale and are therefore independent
of any particular DFA. The predicate dfa refers to the locale axioms and
expresses that its argument, M, is a DFA. The predicate dfa.language refers to the
constant language: outside of the locale, it takes a DFA as an argument.

definition regular :: "('a list) set ⇒ bool" where
  "regular L ≡ ∃M. dfa M ∧ dfa.language M = L"

The other characterisations of a regular language involve abstract finite state 
machines derived from the language itself, with certain equivalence classes as the 
states. A relation is right invariant if it satisfies the following closure property. 

definition right_invariant :: "('a list × 'a list) set ⇒ bool" where
  "right_invariant r ≡ (∀u v w. (u,v) ∈ r ⟶ (u@w, v@w) ∈ r)"

The intuition is that if two words u and v are related, then each word brings the 
“machine” to the same state, and once this has happened, this agreement must 
continue no matter how the words are extended as u@w and v@w. 

A Myhill-Nerode relation for a language L is a right invariant equivalence
relation of finite index where L is the union of some of the equivalence classes
[7, p. 90]. Finite index means the set of equivalence classes is finite: finite
(UNIV//R). The equivalence classes will be the states of a finite state machine.
The equality L = R``A, where A ⊆ L is a set of words of the language, expresses
L as the union of a set of equivalence classes, which will be the final states.

definition MyhillNerode :: "'a list set ⇒ ('a list × 'a list) set ⇒ bool"
  where "MyhillNerode L R ≡ equiv UNIV R ∧ right_invariant R ∧
                            finite (UNIV//R) ∧ (∃A. L = R``A)"

(UNIV denotes a typed universal set, here the set of all words.)

While eq_nextl (defined in Sect. 3.1) refers to a machine, the relation eq_app_right
is defined in terms of a language, L. It relates the words u and v if all extensions
of them, u@w and v@w, behave equally with respect to L:

definition eq_app_right :: "'a list set ⇒ ('a list × 'a list) set" where
  "eq_app_right L ≡ {(u,v). ∀w. u@w ∈ L ⟷ v@w ∈ L}"

It is a Myhill-Nerode relation for L provided it is of finite index: 

lemma MN_eq_app_right:
  "finite (UNIV // eq_app_right L) ⟹ MyhillNerode L (eq_app_right L)"

Moreover, every Myhill-Nerode relation R for L refines eq_app_right L.

lemma MN_refines_eq_app_right: "MyhillNerode L R ⟹ R ⊆ eq_app_right L"

This essentially states that eq_app_right L is the most abstract Myhill-Nerode
relation for L. This will eventually yield a way of defining a minimal machine. 


3.3 The Myhill-Nerode Theorem 

The Myhill-Nerode theorem says that these three statements are equivalent [8]: 

1. The set L is a regular language (is accepted by some DFA). 

2. There exists some Myhill-Nerode relation R for L. 

3. The relation eq_app_right L has finite index. 

We have (1) ⇒ (2) because eq_nextl is a Myhill-Nerode relation. We have
(2) ⇒ (3), by lemma MN_refines_eq_app_right, because every equivalence class for
eq_app_right L is the union of equivalence classes of R, and so eq_app_right L has
minimal index for all Myhill-Nerode relations. We get (3) ⇒ (1) by constructing
a DFA whose states are the (finitely many) equivalence classes of eq_app_right
L. This construction can be done for every Myhill-Nerode relation.

Until now, all proofs have been routine. But now we face a difficulty: the 
states of our machine should be equivalence classes of words, but these could 
be infinite sets. What can be done? The solution adopted here is to map the 
equivalence classes to the natural numbers, which are easily embedded in HF. 
Proving that the set of equivalence classes is finite gives us such a map. 

Mapping infinite sets to integers seems to call into question the very idea 
of representing states by HF sets. However, mapping sets to integers turns out 
to be convenient only occasionally, and it is not necessary: we could formalise 
DFAs differently, coding symbols (and therefore words) as HF sets. Then we 
could represent states by representatives (having type hf) of equivalence classes. 
Using Isabelle’s type-class system to identify the types (integers, booleans, lists, 
etc.) that can be embedded into HF, type ’a dfa could still be polymorphic in 
the type of symbols. But the approach followed here is simpler. 

3.4 Constructing a DFA from a Myhill-Nerode Relation 

If R is a Myhill-Nerode relation for a language L, then the set of equivalence
classes is finite and yields a DFA for L. The construction is packaged as a locale,
which is used once in the proof of the Myhill-Nerode theorem, and again to prove
that minimal DFAs are unique. The locale includes not only L and R, but also
the set A of accepting states, the cardinality n and the bijection h between the
set UNIV//R of equivalence classes and the number n as represented in HF. The
locale assumes the Myhill-Nerode conditions.

locale MyhillNerode_dfa =
  fixes L :: "('a list) set" and R :: "('a list × 'a list) set"
    and A :: "('a list) set" and n :: nat and h :: "('a list) set ⇒ hf"
  assumes eqR: "equiv UNIV R"
      and riR: "right_invariant R"
      and L:   "L = R``A"
      and h:   "bij_betw h (UNIV//R) (hfset (ord_of n))"

The DFA is defined within the locale. The states are given by the equivalence
classes. The initial state is the equivalence class for the empty word; the set of
final states is derived from the set A of words that generate L; the next-state
function maps the equivalence class for the word u to that for u@[x].
Equivalence classes are not the actual states here, but are mapped to integers via the
bijection h. As mentioned above, this use of integers is not essential.

definition DFA :: "'a dfa" where
  "DFA ≡ ⦇states = h ` (UNIV//R),
          init   = h (R `` {[]}),
          final  = {h (R `` {u}) | u. u ∈ A},
          nxt    = λq x. h (⋃u ∈ h⁻¹ q. R `` {u@[x]})⦈"

This can be proved to be a DFA easily. One proof line, using the right-invariance
property and lemmas about quotients El, proves that the next-state function
respects the equivalence relation. Four more lines are needed to verify the
properties of a DFA, somewhat more to show that the language of this DFA is indeed L.

The facts proved within the locale are summarised (outside its scope) by the 
following theorem, stating that every Myhill-Nerode relation yields an equivalent 
DFA. (The obtains form expresses existential and multiple conclusions.) 

theorem MN_imp_dfa:
  assumes "MyhillNerode L R"
  obtains M where "dfa M" "dfa.language M = L"
                  "card (states M) = card (UNIV//R)"

This completes the (3) ⇒ (1) stage, by far the hardest, of the Myhill-Nerode
theorem. The three stages are shown below. Lemma L2_3 includes a result about
cardinality: the construction yields a minimal DFA, which will be useful later.

lemma L1_2: "regular L ⟹ ∃R. MyhillNerode L R"
lemma L2_3:
  assumes "MyhillNerode L R"
  obtains "finite (UNIV // eq_app_right L)"
          "card (UNIV // eq_app_right L) ≤ card (UNIV // R)"
lemma L3_1: "finite (UNIV // eq_app_right L) ⟹ regular L"


4 Nondeterministic Automata and Closure Proofs 


As most of the proofs are simple, our focus will be the use of HF sets when
defining automata. Our main example is the powerset construction for transforming
a nondeterministic automaton into a deterministic one.

4.1 Basic Definition of NFAs 

As in the deterministic case, a record holds the necessary components, while a
locale encapsulates the axioms. Component eps deals with ε-transitions.

record 'a nfa = states :: "hf set"
                init   :: "hf set"
                final  :: "hf set"
                nxt    :: "hf ⇒ 'a ⇒ hf set"
                eps    :: "(hf × hf) set"

The axioms are obvious: the initial, final and next states belong to the set of
states, which is finite. An axiom restricting ε-transitions to machine states was
removed, as it did not simplify proofs. Working with ε-transitions is messy. It
helps to provide special treatment for NFAs having no ε-transitions. Allowing
multiple initial states reduces the need for ε-transitions.

locale nfa =
  fixes M :: "'a nfa"
  assumes init:   "init M ⊆ states M"
      and final:  "final M ⊆ states M"
      and nxt:    "⋀q x. q ∈ states M ⟹ nxt M q x ⊆ states M"
      and finite: "finite (states M)"

The following function “closes up” a set Q of states under ε-transitions.
Intersection with states M confines these transitions to legal states.

definition epsclo :: "hf set ⇒ hf set" where
  "epsclo Q ≡ states M ∩ (⋃q∈Q. {q'. (q,q') ∈ (eps M)*})"

The remaining definitions are straightforward. Note that nextl generalises nxt
to take a set of states as well as a list of symbols.

primrec nextl :: "hf set ⇒ 'a list ⇒ hf set" where
    "nextl Q [] = epsclo Q"
  | "nextl Q (x#xs) = nextl (⋃q ∈ epsclo Q. nxt M q x) xs"

definition language :: "('a list) set" where
  "language ≡ {xs. nextl (init M) xs ∩ final M ≠ {}}"

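The interplay of epsclo and nextl can be imitated in ordinary code. The following Python sketch is an illustration under stated assumptions: states are plain integers rather than HF sets, the NFA is a dictionary, and the example machine A_STAR_B_STAR is hypothetical.

```python
def epsclo(nfa, Q):
    """Close Q under eps-transitions, then confine the result to legal states."""
    closure, frontier = set(Q), list(Q)
    while frontier:
        q = frontier.pop()
        for (p, r) in nfa["eps"]:
            if p == q and r not in closure:
                closure.add(r)
                frontier.append(r)
    return frozenset(closure) & frozenset(nfa["states"])

def nextl(nfa, Q, word):
    """Mirror of the primrec: close before each symbol, and once more at the end."""
    for x in word:
        Q = frozenset(p for q in epsclo(nfa, Q)
                        for p in nfa["nxt"].get((q, x), ()))
    return epsclo(nfa, Q)

def in_language(nfa, word):
    return bool(nextl(nfa, nfa["init"], word) & frozenset(nfa["final"]))

# Hypothetical example: a*b*, using one eps-transition from state 0 to state 1.
A_STAR_B_STAR = {
    "states": {0, 1},
    "init": {0},
    "final": {1},
    "nxt": {(0, "a"): {0}, (1, "b"): {1}},
    "eps": {(0, 1)},
}
```

The empty word is accepted here only because the final closure step adds state 1 to the initial set.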
4.2 The Powerset Construction 

The construction of a DFA to simulate a given NFA is elementary, and is a good 
demonstration of the HF sets. The strongly-typed approach used here requires a 
pair of coercion functions hfset :: "hf => hf set" and HF :: "hf set => hf" 
to convert between HF sets and ordinary sets. 





lemma HF_hfset: "HF (hfset a) = a"

lemma hfset_HF: "finite A ⟹ hfset (HF A) = A"

With this approach, type-checking indicates whether we are dealing with a set 
of states or a single state. The drawback is that we occasionally have to show 
that a set of states is finite in the course of reasoning about the coercions, which 
would never be necessary if we confined our reasoning to the HF world. 

Here is the definition of the DFA. The states are ε-closed subsets of NFA
states, coerced to type hf. The initial and final states are defined similarly, while
the next-state function requires both coercions and performs ε-closure before
and after. We work in locale nfa, with access to the components of the NFA.

definition Power_dfa :: "'a dfa" where
  "Power_dfa ≡ ⦇dfa.states = HF ` epsclo ` Pow (states M),
                init  = HF (epsclo (init M)),
                final = {HF (epsclo Q) | Q. Q ⊆ states M ∧ Q ∩ final M ≠ {}},
                nxt   = λQ x. HF (⋃q ∈ epsclo (hfset Q). epsclo (nxt M q x))⦈"

Proving that this is a DFA is trivial. The hardest case is to show that the
next-state function maps states to states. Proving that the two automata accept
the same language is also simple, by reverse induction on lists (the induction
step concerns u@[x], putting x at the end). Here, Power.language refers to the
language of the powerset DFA, while language refers to that of the NFA.

theorem Power_language: "Power.language = language"
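
The shape of Power_dfa can be mirrored in Python, with frozensets playing the role of the HF coercion. This is a sketch under assumptions: the dictionary layout, the helper names and the example machine are hypothetical, and states must be sortable so that the subsets can be enumerated.

```python
from itertools import chain, combinations

def epsclo(nfa, Q):
    closure, frontier = set(Q), list(Q)
    while frontier:
        q = frontier.pop()
        for (p, r) in nfa["eps"]:
            if p == q and r not in closure:
                closure.add(r)
                frontier.append(r)
    return frozenset(closure) & frozenset(nfa["states"])

def power_dfa(nfa):
    """DFA whose states are the eps-closed subsets of the NFA's states."""
    elems = sorted(nfa["states"])
    subsets = [frozenset(c) for c in chain.from_iterable(
        combinations(elems, k) for k in range(len(elems) + 1))]

    def nxt(Q, x):
        out = set()
        for q in epsclo(nfa, Q):
            out |= epsclo(nfa, nfa["nxt"].get((q, x), ()))
        return frozenset(out)

    return {"states": {epsclo(nfa, Q) for Q in subsets},
            "init": epsclo(nfa, nfa["init"]),
            "final": {epsclo(nfa, Q) for Q in subsets
                      if Q & frozenset(nfa["final"])},
            "nxt": nxt}

def dfa_accepts(dfa, word):
    q = dfa["init"]
    for x in word:
        q = dfa["nxt"](q, x)
    return q in dfa["final"]

def nfa_accepts(nfa, word):
    Q = frozenset(nfa["init"])
    for x in word:
        Q = frozenset(p for q in epsclo(nfa, Q)
                        for p in nfa["nxt"].get((q, x), ()))
    return bool(epsclo(nfa, Q) & frozenset(nfa["final"]))

# Hypothetical example: a*b* with an eps-transition from state 0 to state 1.
NFA = {"states": {0, 1}, "init": {0}, "final": {1},
       "nxt": {(0, "a"): {0}, (1, "b"): {1}}, "eps": {(0, 1)}}
```

On this example the two machines can be compared word by word, which is an informal counterpart of Power_language on short inputs.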

4.3 Other Closure Properties 

The set of languages accepted by some DFA is closed under complement,
intersection, concatenation, repetition (Kleene star), etc. [6]. Consider intersection:

theorem regular_Int:
  assumes S: "regular S" and T: "regular T" shows "regular (S ∩ T)"

The recognising DFA is created by forming the Cartesian product of the sets of
states of MS and MT, the DFAs of the two languages. The machines are effectively
run in parallel. The decision to represent a set of states by type hf set rather
than by type hf means we cannot write dfa.states MS × dfa.states MT, but
we can express this concept using set comprehension:

  "⦇states = {⟨q1,q2⟩ | q1 q2. q1 ∈ dfa.states MS ∧ q2 ∈ dfa.states MT},
    init   = ⟨dfa.init MS, dfa.init MT⟩,
    final  = {⟨q1,q2⟩ | q1 q2. q1 ∈ dfa.final MS ∧ q2 ∈ dfa.final MT},
    nxt    = λ⟨qs,qt⟩ x. ⟨dfa.nxt MS qs x, dfa.nxt MT qt x⟩⦈"

This is trivially shown to be a DFA. Showing that it accepts the intersection of 
the given languages is again easy by reverse induction. 
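
The same product machine is easy to write down in Python, with tuples standing in for HF ordered pairs. The sketch is illustrative; the record layout and the two example machines are assumptions.

```python
def product_dfa(ms, mt):
    """Run two DFAs in parallel on pairs of states."""
    return {
        "states": {(q1, q2) for q1 in ms["states"] for q2 in mt["states"]},
        "init": (ms["init"], mt["init"]),
        "final": {(q1, q2) for q1 in ms["final"] for q2 in mt["final"]},
        "nxt": lambda q, x: (ms["nxt"](q[0], x), mt["nxt"](q[1], x)),
    }

def accepts(dfa, word):
    q = dfa["init"]
    for x in word:
        q = dfa["nxt"](q, x)
    return q in dfa["final"]

# Hypothetical examples over {a, b}.
EVEN_A = {"states": {0, 1}, "init": 0, "final": {0},
          "nxt": lambda q, x: 1 - q if x == "a" else q}
ENDS_B = {"states": {0, 1}, "init": 0, "final": {1},
          "nxt": lambda q, x: 1 if x == "b" else 0}

BOTH = product_dfa(EVEN_A, ENDS_B)
```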

Closure under concatenation is expressed as follows: 

theorem regular_conc:
  assumes S: "regular S" and T: "regular T" shows "regular (S @@ T)"


The concatenation is recognised by an NFA involving the disjoint sum of
the sets of states of MS and MT, the DFAs of the two languages. The effect is
to simulate the first machine until it accepts a string, then to transition to a
simulation of the second machine. There are ε-transitions linking every final
state of MS to the initial state of MT. We again cannot write dfa.states MS +
dfa.states MT, but we can express the disjoint sum naturally enough:

  "⦇states = Inl ` (dfa.states MS) ∪ Inr ` (dfa.states MT),
    init   = {Inl (dfa.init MS)},
    final  = Inr ` (dfa.final MT),
    nxt    = λq x. sum_case (λqs. {Inl (dfa.nxt MS qs x)})
                            (λqt. {Inr (dfa.nxt MT qt x)}) q,
    eps    = (λq. (Inl q, Inr (dfa.init MT))) ` dfa.final MS⦈"

Again, it is trivial to show that this is an NFA. But unusually, proving that it 
recognises the concatenation of the languages is a challenge. We need to show, 
by induction, that the “left part” of the NFA correctly simulates MS. 

have "⋀q. Inl q ∈ ST.nextl {Inl (dfa.init MS)} u ⟷
          q = (dfa.nextl MS (dfa.init MS) u)"

The key property is that any string accepted by the NFA can be split into strings 
accepted by the two DFAs. The proof involves a fairly messy induction. 

have "⋀q. Inr q ∈ ST.nextl {Inl (dfa.init MS)} u ⟷
          (∃uS uT. uS ∈ dfa.language MS ∧ u = uS@uT ∧
                   q = dfa.nextl MT (dfa.init MT) uT)"
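
The splitting property can be checked experimentally on small machines. The Python sketch below is an illustration under assumptions: tagged tuples ("L", q) and ("R", q) stand in for Inl and Inr, and the example DFAs and helper names are hypothetical.

```python
def conc_nfa(ms, mt, alphabet):
    """NFA on the disjoint sum of the state sets, linked by eps-transitions."""
    inl = lambda q: ("L", q)
    inr = lambda q: ("R", q)
    nxt = {}
    for x in alphabet:
        for q in ms["states"]:
            nxt[(inl(q), x)] = {inl(ms["nxt"](q, x))}
        for q in mt["states"]:
            nxt[(inr(q), x)] = {inr(mt["nxt"](q, x))}
    return {
        "states": {inl(q) for q in ms["states"]} | {inr(q) for q in mt["states"]},
        "init": {inl(ms["init"])},
        "final": {inr(q) for q in mt["final"]},
        "nxt": nxt,
        "eps": {(inl(q), inr(mt["init"])) for q in ms["final"]},
    }

def epsclo(nfa, Q):
    closure, frontier = set(Q), list(Q)
    while frontier:
        q = frontier.pop()
        for (p, r) in nfa["eps"]:
            if p == q and r not in closure:
                closure.add(r)
                frontier.append(r)
    return frozenset(closure) & frozenset(nfa["states"])

def nfa_accepts(nfa, word):
    Q = frozenset(nfa["init"])
    for x in word:
        Q = frozenset(p for q in epsclo(nfa, Q)
                        for p in nfa["nxt"].get((q, x), ()))
    return bool(epsclo(nfa, Q) & frozenset(nfa["final"]))

def dfa_accepts(dfa, word):
    q = dfa["init"]
    for x in word:
        q = dfa["nxt"](q, x)
    return q in dfa["final"]

# Hypothetical examples over {a, b}.
MS = {"states": {0, 1}, "init": 0, "final": {0},
      "nxt": lambda q, x: 1 - q if x == "a" else q}      # even number of a's
MT = {"states": {0, 1}, "init": 0, "final": {1},
      "nxt": lambda q, x: 1 if x == "b" else 0}          # ends in b
ST = conc_nfa(MS, MT, "ab")
```

A word is accepted by ST exactly when it splits as uS followed by uT with each part accepted by the corresponding DFA, at least on the short words tried here.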

Closure under Kleene star is not presented here, as it involves no interesting 
set operations. The language L* is recognised by an NFA with an extra state, 
which serves as the initial state and runs the DFA for L including iteration. The 
proofs are messy, with many cases. To their credit, Hopcroft and Ullman [B] give 
some details, while other authors content themselves with diagrams alone. 

5 State Minimisation for DFAs 

Given a regular language L, the Myhill-Nerode theorem yields a DFA having the 
minimum number of states. But it does not yield a minimisation algorithm for 
a given automaton. It turns out that a DFA is minimal if it has no unreachable 
states and if no two states are indistinguishable (in a sense made precise below). 
This again does not yield an algorithm. Brzozowski's minimisation algorithm 
involves reversing the DFA to create an NFA, converting back to a DFA via 
powersets, removing unreachable states, then repeating those steps to undo the 
reversal. Surprisingly, it performs well in practice [5]. 

5.1 The Left and Right Languages of a State 

The following developments are done within the locale dfa, and therefore refer 
to one particular deterministic finite automaton. 


The left language of a state q is the set of all words w such that
$q_0 \xrightarrow{w}{}^* q$, or
informally, such that the machine when started in the initial state and given the
word w ends up in q. In a DFA, the left languages of distinct states are disjoint,
if they are nonempty.

definition left_lang :: "hf ⇒ ('a list) set" where
  "left_lang q ≡ {u. nextl (init M) u = q}"


The right language of a state q is the set of all words w such that
$q \xrightarrow{w}{}^* q_f$,
where $q_f$ is a final state, or informally, such that the machine when started in q
will accept the word w. The language of a DFA is the right language of $q_0$. Two
states having the same right language are indistinguishable: they both lead to
the same words being accepted.

definition right_lang :: "hf ⇒ ('a list) set" where
  "right_lang q ≡ {u. nextl q u ∈ final M}"

The accessible states are those that can be reached by at least one word. 

definition accessible :: "hf set" where
  "accessible ≡ {q. left_lang q ≠ {}}"

The function path_to returns one specific such word. This function will
eventually be used to express an isomorphism between any minimal DFA (one having
no inaccessible or indistinguishable states) and the canonical DFA determined
by the Myhill-Nerode theorem.

definition path_to :: "hf ⇒ 'a list" where
  "path_to q ≡ SOME u. u ∈ left_lang q"

lemma nextl_path_to:
  "q ∈ accessible ⟹ nextl (dfa.init M) (path_to q) = q"

First, we deal with the problem of inaccessible states. It is easy to restrict 
any DFA to one having only accessible states. 

definition Accessible_dfa :: "'a dfa" where
  "Accessible_dfa ≡ ⦇dfa.states = accessible,
                     init  = init M,
                     final = final M ∩ accessible,
                     nxt   = nxt M⦈"

This construction is readily shown to be a DFA that agrees with the
original in most respects. In particular, the two automata agree on left_lang and
right_lang, and therefore on the language they accept:

lemma Accessible_language: "Accessible.language = language"
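
Computationally, the accessible states can be found by a simple graph search. The formal definition is stated via left languages rather than by search, so the following Python sketch uses a swapped-in technique; the record layout and the example machine are assumptions.

```python
def accessible(dfa, alphabet):
    """States reachable from the initial state by at least one word."""
    seen, frontier = {dfa["init"]}, [dfa["init"]]
    while frontier:
        q = frontier.pop()
        for x in alphabet:
            p = dfa["nxt"][(q, x)]
            if p not in seen:
                seen.add(p)
                frontier.append(p)
    return seen

def accessible_dfa(dfa, alphabet):
    """Restrict the state sets; the next-state function is left unchanged."""
    acc = accessible(dfa, alphabet)
    return {"states": acc, "init": dfa["init"],
            "final": set(dfa["final"]) & acc, "nxt": dfa["nxt"]}

def accepts(dfa, word):
    q = dfa["init"]
    for x in word:
        q = dfa["nxt"][(q, x)]
    return q in dfa["final"]

# Hypothetical example: state 2 cannot be reached from the initial state 0.
M = {"states": {0, 1, 2}, "init": 0, "final": {1, 2},
     "nxt": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1,
             (2, "a"): 0, (2, "b"): 2}}
ACC = accessible_dfa(M, "ab")
```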

We can now define a DFA to be minimal if all states are accessible and no two
states have the same right language. (The formula inj_on right_lang (dfa.states
M) expresses that the function right_lang is injective on the set dfa.states M.)

definition minimal where
  "minimal ≡ accessible = states M ∧ inj_on right_lang (dfa.states M)"

Because we are working within the DFA locale, minimal is a constant referring 
to one particular automaton. 

5.2 A Collapsing Construction 

We can deal with indistinguishable states similarly, defining a DFA in which the 
indistinguishable states are identified via equivalence classes. This is not part 
of Brzozowski’s minimisation algorithm, but it is interesting in its own right: 
the equivalence classes themselves are HF sets. We begin by declaring a relation 
stating that two states are equivalent if they have the same right language. 

definition eq_right_lang :: "(hf × hf) set" where
  "eq_right_lang ≡ {(u,v). u ∈ states M ∧ v ∈ states M ∧
                           right_lang u = right_lang v}"

Trivially, this is an equivalence relation, and equivalence classes of states are 
finite (there are only finitely many states). In the corresponding DFA, these 
equivalence classes form the states, with the initial and final states given by the 
equivalence classes for the corresponding states of the original DFA. As usual, 
the function HF is used to coerce a set of states to type hf. 

definition Collapse_dfa :: "'a dfa" where
  "Collapse_dfa ≡ ⦇dfa.states = HF ` (states M // eq_right_lang),
                   init  = HF (eq_right_lang `` {init M}),
                   final = {HF (eq_right_lang `` {q}) | q. q ∈ final M},
                   nxt   = λQ x. HF (⋃q ∈ hfset Q. eq_right_lang `` {nxt M q x})⦈"

This is easily shown to be a DFA, and the next-state function respects the
equivalence relation. Showing that it accepts the same language is straightforward.

lemma ext_language_Collapse_dfa:
  "u ∈ Collapse.language ⟷ u ∈ language"

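The relation eq_right_lang is defined abstractly, with no algorithm attached. To experiment with the collapsing construction, one can decide right-language equality by iterated partition refinement. That technique is swapped in for this Python sketch and is not taken from the formal development; all names and the example machine are assumptions.

```python
def right_lang_blocks(dfa, alphabet):
    """Number the states so that equal numbers mean equal right languages."""
    states = sorted(dfa["states"])
    block = {q: int(q in dfa["final"]) for q in states}
    while True:
        sig = {q: (block[q],) + tuple(block[dfa["nxt"][(q, x)]] for x in alphabet)
               for q in states}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {q: ids[sig[q]] for q in states}
        if len(ids) == len(set(block.values())):
            return refined           # no block was split: the partition is stable
        block = refined

def collapse_dfa(dfa, alphabet):
    """DFA whose states are the equivalence classes, as frozensets of states."""
    block = right_lang_blocks(dfa, alphabet)
    cls = lambda q: frozenset(p for p in dfa["states"] if block[p] == block[q])
    classes = {cls(q) for q in dfa["states"]}
    nxt = {(Q, x): frozenset().union(*(cls(dfa["nxt"][(q, x)]) for q in Q))
           for Q in classes for x in alphabet}
    return {"states": classes, "init": cls(dfa["init"]),
            "final": {cls(q) for q in dfa["final"]}, "nxt": nxt}

def accepts(dfa, word):
    q = dfa["init"]
    for x in word:
        q = dfa["nxt"][(q, x)]
    return q in dfa["final"]

# Hypothetical example: counting a's modulo 4, accepting even counts.
MOD4 = {"states": {0, 1, 2, 3}, "init": 0, "final": {0, 2},
        "nxt": {**{(q, "a"): (q + 1) % 4 for q in range(4)},
                **{(q, "b"): q for q in range(4)}}}
COLLAPSED = collapse_dfa(MOD4, "ab")
```

States 0 and 2 have the same right language, as do 1 and 3, so the collapsed machine has two states.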
5.3 The Uniqueness of Minimal DFAs 

The property minimal is true for machines having no inaccessible or
indistinguishable states. To prove that such a machine actually has a minimal number
of states is tricky. It can be shown to be isomorphic to the canonical machine
from the Myhill-Nerode theorem, which indeed has a minimal number of states.

Automata M and N are isomorphic if there exists a bijection h between their 
state sets that preserves their initial, final and next states. This conception is 
nicely captured by a locale, taking the DFAs as parameters: 

locale dfa_isomorphism = M: dfa M + N: dfa N
    for M :: "'a dfa" and N :: "'a dfa" +
  fixes h :: "hf ⇒ hf"
  assumes h:     "bij_betw h (states M) (states N)"
      and init:  "h (init M) = init N"
      and final: "h ` final M = final N"
      and nxt:   "⋀q x. q ∈ states M ⟹ h (nxt M q x) = nxt N (h q) x"

With this concept at our disposal, we resume working within the locale dfa, 
which is concerned with the automaton M. If no two states have the same right 
language, then there is a bijection between the accessible states (of M) and the 
equivalence classes yielded by the relation eq_app_right language. 

lemma inj_right_lang_imp_eq_app_right_index:
  assumes "inj_on right_lang (dfa.states M)"
  shows "bij_betw (λq. eq_app_right language `` {path_to q})
                  accessible (UNIV // eq_app_right language)"

This bijection maps the state q to eq_app_right language `` {path_to q}. Every
element of the quotient UNIV // eq_app_right language can be expressed in this
form. And therefore, the number of states in a minimal machine equals the index
of eq_app_right language.

definition min_states where
  "min_states ≡ card (UNIV // eq_app_right language)"

lemma minimal_imp_index_eq_app_right:
  "minimal ⟹ card (dfa.states M) = min_states"

In the proof of the Myhill-Nerode theorem, it emerged that this index was
the minimum cardinality for any DFA accepting the given language. Any other
automaton, M', accepting the same language cannot have fewer states. This
theorem justifies the claim that minimal indeed characterises a minimal DFA.

theorem minimal_imp_card_states_le:
  "⟦minimal; dfa M'; dfa.language M' = language⟧
   ⟹ card (dfa.states M) ≤ card (dfa.states M')"

Note that while the locale dfa gives us implicit access to one DFA, namely M, it 
is still possible to refer to other automata, as we see above. 

The minimal machine is unique up to isomorphism because every minimal 
machine is isomorphic to the canonical Myhill-Nerode DFA. The construction of 
a DFA from a Myhill-Nerode relation was packaged as a locale, and by applying 
this locale to the given language and the relation eq_app_right language , we can 
generate the instance we need. 

interpretation Canon:
  MyhillNerode_dfa language "eq_app_right language"
                   language min_states index_f

Here, index_f denotes some bijection between the equivalence classes and their
cardinality (as an HF ordinal). It exists (definition omitted) by the definition
of cardinality itself. It is the required isomorphism function between M and the
canonical DFA of Sect. 3.4, which is written Canon.DFA.



definition iso :: "hf ⇒ hf" where
  "iso ≡ index_f o (λq. eq_app_right language `` {path_to q})"

The isomorphism property is stated using locale dfa_isomorphism.

theorem minimal_imp_isomorphic_to_canonical:
  assumes minimal shows "dfa_isomorphism M Canon.DFA iso"

Verifying the isomorphism conditions requires delicate reasoning. Hopcroft and
Ullman’s proof [8, p. 29-30] provides just a few clues.

5.4 Brzozowski’s Minimisation Algorithm 

At the core of this minimisation algorithm is an NFA obtained by reversing all 
the transitions of a given DFA, and exchanging the initial and final states. 

definition Reverse_nfa :: "'a dfa ⇒ 'a nfa" where
  "Reverse_nfa MS ≡ ⦇nfa.states = dfa.states MS,
                     init  = dfa.final MS,
                     final = {dfa.init MS},
                     nxt   = λq x. {p ∈ dfa.states MS. q = dfa.nxt MS p x},
                     eps   = {}⦈"

This is easily shown to be an NFA that accepts the reverse of every word accepted 
by the original DFA. Applying the powerset construction yields a new DFA that 
has no indistinguishable states. The point is that the right language of a powerset 
state is derived from the right languages of the constituent states of the reversal 
NFA [3]. Those, in turn, are the left languages of the original DFA, and these 
are disjoint (since the original DFA has no inaccessible states, by assumption). 

lemma inj_on_right_lang_PR:
  assumes "dfa.states M = accessible"
  shows "inj_on (dfa.right_lang (nfa.Power_dfa (Reverse_nfa M)))
                (dfa.states (nfa.Power_dfa (Reverse_nfa M)))"

The following definitions abbreviate the steps of Brzozowski’s algorithm. 

abbreviation APR :: "'x dfa ⇒ 'x dfa" where
  "APR X ≡ dfa.Accessible_dfa (nfa.Power_dfa (Reverse_nfa X))"

definition Brzozowski :: "'a dfa" where
  "Brzozowski ≡ APR (APR M)"

By the lemma proved just above, the APR operation yields minimal DFAs. 

theorem minimal_APR:
  assumes "dfa.states M = accessible"
  shows "dfa.minimal (APR M)"

Brzozowski’s minimisation algorithm is correct. The first APR call reverses the 
language and eliminates inaccessible states; the second call yields a minimal 
machine for the original language. The proof uses the theorems just proved. 


theorem minimal_Brzozowski: "dfa.minimal Brzozowski"
unfolding Brzozowski_def
proof (rule dfa.minimal_APR)
  show "dfa (APR M)"
    by (simp add: dfa.dfa_Accessible nfa.dfa_Power nfa_Reverse_nfa)
next
  show "dfa.states (APR M) = dfa.accessible (APR M)"
    by (simp add: dfa.Accessible_accessible dfa.states_Accessible_dfa
                  nfa.dfa_Power nfa_Reverse_nfa)
qed
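
For experimentation, the algorithm can be run on small machines. The Python sketch below is an illustration under assumptions: it fuses the powerset and accessibility steps by building only the reachable subsets, which differs from the three separate constructions composed in APR, and all names and the example machine are hypothetical.

```python
def reverse_nfa(dfa, alphabet):
    """Reverse every transition and exchange the initial and final states."""
    nxt = {}
    for p in dfa["states"]:
        for x in alphabet:
            nxt.setdefault((dfa["nxt"][(p, x)], x), set()).add(p)
    return {"init": frozenset(dfa["final"]), "final": {dfa["init"]}, "nxt": nxt}

def accessible_power_dfa(nfa, alphabet):
    """Powerset construction restricted to subsets reachable from the start."""
    init = frozenset(nfa["init"])
    states, frontier, nxt = {init}, [init], {}
    while frontier:
        Q = frontier.pop()
        for x in alphabet:
            P = frozenset(p for q in Q for p in nfa["nxt"].get((q, x), ()))
            nxt[(Q, x)] = P
            if P not in states:
                states.add(P)
                frontier.append(P)
    final = {Q for Q in states if Q & frozenset(nfa["final"])}
    return {"states": states, "init": init, "final": final, "nxt": nxt}

def apr(dfa, alphabet):
    return accessible_power_dfa(reverse_nfa(dfa, alphabet), alphabet)

def brzozowski(dfa, alphabet):
    return apr(apr(dfa, alphabet), alphabet)

def accepts(dfa, word):
    q = dfa["init"]
    for x in word:
        q = dfa["nxt"][(q, x)]
    return q in dfa["final"]

# Hypothetical example: counting a's modulo 4, accepting even counts.
MOD4 = {"states": {0, 1, 2, 3}, "init": 0, "final": {0, 2},
        "nxt": {**{(q, "a"): (q + 1) % 4 for q in range(4)},
                **{(q, "b"): q for q in range(4)}}}
MIN = brzozowski(MOD4, "ab")
```

The four-state input recognises the words with an even number of a's, and the result has two states, the minimum for that language.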


6 Related Work 

There is a great body of prior work. One approach involves working
constructively, in some sort of type theory. Constable’s group has formalised automata
[tj in Nuprl, including the Myhill-Nerode theorem. Using type theory in the form
of Coq and its Ssreflect library, Doczkal et al. [5] formalise much of the same
material as the present paper. They omit ε-transitions and Brzozowski’s algorithm
and add the pumping lemma and Kleene’s algorithm for translating a DFA to
a regular expression. Their development is of a similar length, under 1400 lines,
and they allow the states of a finite automaton to be given by any finite type. In
a substantial development, Braibant and Pous [2] have implemented a tactic for
solving equations in Kleene algebras by implementing efficient finite automata
algorithms in Coq. They represent states by integers.

An early example of regular expression theory formalised using higher-order
logic (Isabelle/HOL) is Nipkow’s verified lexical analyser [9]. His automata are
polymorphic in the types of state and symbols. NFAs are included, with
ε-transitions simulated by an alphabet extended with a dummy symbol.

Recent Isabelle developments explicitly bypass automata theory. Wu et al.
m prove the Myhill-Nerode theorem using regular expressions. This is a
significant feat, especially considering that the theorem’s underlying intuitions come
from automata. Current work on regular expression equivalence EES] continues
to focus on regular expressions rather than finite automata.

This paper describes not a project undertaken by a team, but a six-week 
case study by one person. Its successful outcome obviously reflects Isabelle’s 
powerful automation, but the key factor is the simplicity of the specifications. 
Finite automata cause complications in the prior work. The HF sets streamline 
the specifications and allow elementary set-theoretic reasoning. 

7 Conclusions 

The theory of finite automata can be developed straightforwardly using higher- 
order logic and HF set theory. We can formalise the textbook proofs: there is 
no need to shun automata or use constructive type theories. HF set theory can 
be seen as an abstract universe of computable objects, with many potential
applications. One possibility is programming language semantics: using hf as
the type of values offers open-ended possibilities, including integer, rational and 
floating point numbers, ASCII characters, and data structures. 

Acknowledgements. Christian Urban and Tobias Nipkow offered advice, and sug[...]
a variety of useful comments.